A study of high quality speech synthesis based on the analysis of the randomness in speech signals

Author

Naofumi Aoki

Abstract

Randomness observed in human speech signals is considered a key factor in the naturalness of human speech. This research project investigated the characteristics of several kinds of randomness observed in speech signals phonated by normal speakers. Based on the results of the analysis, techniques for artificially reproducing such randomness were developed with the aim of enhancing the voice quality of synthesized speech. The types of randomness investigated in particular were: (1) amplitude fluctuation, (2) period fluctuation, (3) waveform fluctuation, (4) random fractalness of the source signals obtained by linear predictive analysis, and (5) unvoiced characteristics, namely, aperiodicity observed in voiced consonants. A simple model of each form of randomness was built from its statistical characteristics, and its contribution to high-quality speech synthesis systems based on the LPC (linear predictive coding) vocoder was evaluated.

Normal sustained vowels always contain cycle-to-cycle changes in maximum peak amplitude and pitch period, even when the values seem to be quite stable. This project investigated the statistical characteristics of these fluctuations, labeled amplitude fluctuation and period fluctuation, respectively. Since the frequency characteristics of these fluctuation sequences appeared to roughly follow a 1/f power law, the author concluded that, as a preliminary model, amplitude and period fluctuation could be modeled as 1/f fluctuations. Psychoacoustic experiments performed in this study indicated that differences in the frequency characteristics of the amplitude and period fluctuation could influence the voice quality of synthesized speech. Compared with 1/f^0 (white noise), 1/f^2, and 1/f^3 fluctuation models, amplitude and period fluctuation modeled as 1/f fluctuations could produce voice quality more similar to that of human speech phonated by normal speakers.

Normal sustained vowels also always contain cycle-to-cycle changes in the waveform itself, even during their steadiest parts. This project investigated the statistical characteristics of the waveform fluctuations extracted from the residual signals of the LPC vocoder. Since the frequency characteristics of the waveform fluctuations appeared to follow a 1/f power law, the author concluded that, as a preliminary model, the waveform fluctuations could be modeled as 1/f fluctuations. Psychoacoustic experiments performed in this study indicated that differences in the frequency characteristics of waveform fluctuations could influence the voice quality of synthesized speech. Compared with 1/f^0 (white noise), 1/f^2, and 1/f^3 fluctuation models, waveform fluctu-
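As a rough illustration of the 1/f fluctuation model mentioned in the abstract, the sketch below generates a sequence whose power spectrum falls off as 1/f^beta by shaping a random-phase spectrum, then uses it to perturb a sequence of pitch periods. This is not the author's implementation: the function names, the spectral-shaping method, the mean pitch period, and the fluctuation depth are all assumptions chosen for illustration.

```python
import cmath
import math
import random


def power_law_noise(n, beta, rng):
    """Return n zero-mean, unit-RMS samples with power spectrum ~ 1/f**beta."""
    half = n // 2
    spectrum = [0j] * n  # DC bin stays zero, so the sequence has zero mean
    for k in range(1, half + 1):
        amplitude = k ** (-beta / 2.0)  # power = amplitude**2 ~ 1/f**beta
        phase = rng.uniform(0.0, 2.0 * math.pi)
        spectrum[k] = cmath.rect(amplitude, phase)
        spectrum[n - k] = spectrum[k].conjugate()  # Hermitian symmetry -> real output
    if n % 2 == 0:
        spectrum[half] = complex(spectrum[half].real, 0.0)  # Nyquist bin must be real
    # Direct inverse DFT (O(n^2)); fine for the short sequences used here.
    twiddle = [cmath.exp(2j * math.pi * j / n) for j in range(n)]
    samples = [
        sum(spectrum[k] * twiddle[(k * t) % n] for k in range(n)).real
        for t in range(n)
    ]
    rms = math.sqrt(sum(x * x for x in samples) / n)
    return [x / rms for x in samples]


def spectral_slope(samples):
    """Least-squares slope of log power versus log frequency bin."""
    n = len(samples)
    xs, ys = [], []
    for k in range(1, n // 2):  # skip DC and the Nyquist bin
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        xs.append(math.log(k))
        ys.append(math.log(abs(coeff) ** 2))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Hypothetical values, chosen only for illustration.
rng = random.Random(0)
base_period_ms = 8.0  # assumed mean pitch period (125 Hz)
depth = 0.01          # assumed relative depth of the period fluctuation

fluctuation = power_law_noise(128, beta=1.0, rng=rng)
periods_ms = [base_period_ms * (1.0 + depth * x) for x in fluctuation]
print(round(spectral_slope(fluctuation), 3))
```

Setting `beta` to 0, 2, or 3 yields the comparison conditions named in the abstract (white noise, 1/f^2, and 1/f^3), and the same sequence could scale peak amplitudes instead of pitch periods. The slope check recovers `-beta` closely here only because the spectral magnitudes are set deterministically; fluctuation sequences measured from real speech would follow the power law only approximately.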

Similar resources

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics in the multimedia domain concerns enabling computers to produce speech. Speech synthesis grants computers the human ability of speech production. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal degrades significantly in the presence of environmental noise, which leads to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. This paper considers single-channel enhancement of speech signals corrupted by additive noise. A dictionary-based algorithm is proposed to train the speech...

Speech Enhancement Through an Optimized Subspace Division Technique

Speech enhancement techniques are often employed to improve the quality and intelligibility of noisy speech signals. This paper discusses a novel technique for speech enhancement based on Singular Value Decomposition. The implementation uses a Genetic Algorithm based optimization method to reduce the effects of environmental noise on the singular vectors as well as t...

Speech Enhancement using Adaptive Data-Based Dictionary Learning

This paper presents a speech enhancement method based on sparse representation of data frames. Speech enhancement is one of the most widely applicable areas across signal processing fields. The objective of a speech enhancement system is to improve either the intelligibility or the quality of speech signals. This process is carried out using the speech signal processing techniques ...

The Effect of Private Speech and Self-Regulation on Translation Quality among Iranian Translation Students: A Mixed-Methods Study

The current study presents findings from a mixed-methods investigation of the self-regulatory role of private speech (self-talk) in students’ translation quality. The aim of the study was to validate the adapted version of a self-verbalization questionnaire. The construct validity and reliability of the scale were supported by the CFA, which revealed that all items reached the acceptable f...

Publication date: 2000
